Ijraset Journal For Research in Applied Science and Engineering Technology
Authors: Sourav Angre, Ritesh Patil, Prasad Yeole, Mrs. Varsha Dharmadhikari
DOI Link: https://doi.org/10.22214/ijraset.2026.83848
Heart disease remains one of the leading causes of mortality worldwide, emphasizing the need for accurate and interpretable predictive systems that can support early diagnosis and clinical decision-making. While machine learning techniques have demonstrated strong predictive capabilities in cardiovascular disease detection, their adoption in healthcare is often limited by the lack of transparency and explainability in model predictions. This study proposes an Explainable Artificial Intelligence (XAI)-driven framework for heart disease prediction using clinical parameters and ensemble learning techniques. The framework utilizes the UCI Heart Disease Dataset, comprising 1,025 patient records with 13 clinically relevant attributes, including age, chest pain type, cholesterol level, resting blood pressure, maximum heart rate achieved, and exercise-induced angina. Two machine learning models, namely Logistic Regression and Random Forest, are developed and evaluated for binary classification of heart disease risk. To enhance model transparency and clinical trust, SHapley Additive Explanations (SHAP) are integrated to provide both global and patient-level interpretations of model predictions. Global explanations identify the most influential clinical factors affecting prediction outcomes, while local explanations provide individualized reasoning for specific patient predictions. Experimental results demonstrate that the Random Forest model outperforms Logistic Regression, achieving superior accuracy, precision, recall, and F1-score. Receiver Operating Characteristic (ROC) analysis further confirms the strong discriminative capability of the proposed framework, achieving an Area Under the Curve (AUC) of 0.857. SHAP-based analysis reveals that clinical attributes such as the number of major vessels (ca), chest pain type (cp), thalassemia status (thal), ST depression induced by exercise (oldpeak), and maximum heart rate achieved (thalach) are among the most influential predictors of heart disease. The proposed framework not only delivers reliable predictive performance but also provides interpretable and clinically meaningful explanations, making it a practical decision-support tool for healthcare professionals.
Cardiovascular diseases (CVDs) are among the leading causes of global mortality, creating major health and economic challenges. Early detection of heart disease risk is essential for timely treatment and improved patient outcomes. Traditional diagnosis methods depend on clinical expertise, laboratory tests, and medical imaging, but the increasing availability of healthcare data has encouraged the use of Machine Learning (ML) for automated risk prediction.
Machine learning algorithms such as Logistic Regression, Support Vector Machines, Decision Trees, and Random Forest can identify hidden patterns in clinical data and assist healthcare professionals in decision-making. However, many high-performing ML models lack transparency and operate as black-box systems, reducing clinician trust.
To overcome this limitation, Explainable Artificial Intelligence (XAI) techniques, especially SHapley Additive Explanations (SHAP), are used to provide explanations for model predictions. SHAP identifies the contribution of individual clinical features to prediction outcomes, enabling both overall model interpretation and patient-specific explanations.
The proposed research develops an Explainable AI-based heart disease prediction framework using clinical data from the UCI Heart Disease Dataset. It combines machine learning classification with SHAP-based explainability to achieve accurate, transparent, and clinically meaningful predictions.
The study focuses on:
Early heart disease prediction relied on statistical methods such as Logistic Regression and rule-based systems. These approaches were simple and interpretable but had limitations in handling complex nonlinear relationships between multiple clinical factors.
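Common clinical indicators include age, chest pain type, cholesterol level, resting blood pressure, maximum heart rate achieved, and exercise-induced angina.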
Machine learning has improved disease prediction by analyzing large healthcare datasets and identifying complex patterns. Common algorithms include Logistic Regression, Support Vector Machines, Decision Trees, and Random Forest.
Although these models often provide higher accuracy than traditional methods, many lack interpretability, which is critical in medical applications.
Random Forest is an ensemble learning technique that combines multiple decision trees to improve accuracy and reduce overfitting. It is effective for healthcare applications because it:
Studies show Random Forest often performs better than traditional classifiers in heart disease prediction tasks.
XAI improves trust in machine learning systems by explaining why a model produces a specific prediction. SHAP is a widely used XAI method based on cooperative game theory.
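For reference, the game-theoretic quantity behind SHAP can be written as follows. The notation is the standard one from the SHAP literature and is not taken from this paper: F is the set of features, f_x(S) is the expected model output when only the features in S are known for patient x, and \phi_0 is the average model output over the data.

```latex
\phi_i(f, x) = \sum_{S \subseteq F \setminus \{i\}}
  \frac{|S|!\,\bigl(|F| - |S| - 1\bigr)!}{|F|!}
  \Bigl[ f_x\bigl(S \cup \{i\}\bigr) - f_x(S) \Bigr],
\qquad
f(x) = \phi_0 + \sum_{i \in F} \phi_i(f, x)
```

The first expression averages the marginal contribution of feature i over all subsets of the remaining features; the second states that the contributions add up to the prediction for that patient.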
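SHAP provides global explanations, which identify the clinical features that most influence predictions overall, and local explanations, which give the reasoning behind an individual patient's prediction.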
This improves model transparency and helps clinicians understand AI-assisted decisions.
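A minimal sketch of how a patient-level attribution of this kind can be computed, using brute-force enumeration of feature coalitions instead of the SHAP library. The toy risk function, its coefficients, the patient values, and the baseline are all hypothetical and chosen only for illustration; replacing absent features with a single baseline value is a simplifying assumption.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance by enumerating all coalitions.

    Features outside a coalition are set to their baseline value
    (a simplifying assumption made for this sketch).
    """
    n = len(x)
    features = list(range(n))

    def value(coalition):
        # Model output when only the coalition's features take the patient's values.
        z = [x[i] if i in coalition else baseline[i] for i in features]
        return predict(z)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(z):
    # Hypothetical risk score over (ca, oldpeak, thalach); not fitted to any data.
    ca, oldpeak, thalach = z
    return 0.2 * ca + 0.1 * oldpeak - 0.002 * thalach + 0.05 * ca * oldpeak


patient = [2.0, 1.5, 120.0]
baseline = [0.0, 1.0, 150.0]  # stand-in for an "average" patient

phi = shapley_values(toy_model, patient, baseline)
# The contributions sum to the gap between the patient's score and the baseline score.
gap = toy_model(patient) - toy_model(baseline)
```

Each entry of `phi` is one feature's contribution to this patient's score. Averaging the absolute contributions over many patients gives a global ranking of features. Enumeration costs grow exponentially with the number of features, which is why practical tools rely on model-specific or approximate algorithms.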
Existing heart disease prediction studies mainly focus on improving accuracy while giving limited attention to explainability. Many systems provide only general feature importance rather than patient-level explanations.
The research addresses this gap by combining machine learning classification with SHAP-based global and patient-level explanations.
The study uses the UCI Heart Disease Dataset, containing 1,025 patient records with 13 clinically relevant attributes.
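Clinical features include age, chest pain type (cp), cholesterol level, resting blood pressure, maximum heart rate achieved (thalach), exercise-induced angina, ST depression induced by exercise (oldpeak), the number of major vessels (ca), and thalassemia status (thal).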
The problem is treated as a binary classification task.
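For a binary task, the evaluation measures named in this study (accuracy, precision, recall, F1-score, and AUC) can all be derived from true labels and predicted scores. The sketch below shows one way to compute them in plain Python. The labels, scores, and the 0.5 decision threshold are made-up illustrations, not the study's data or settings.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = disease)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def roc_auc(y_true, scores):
    """AUC as the chance that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Illustrative labels and predicted probabilities for eight patients.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.2, 0.6, 0.7, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Accuracy, precision, recall, and F1 depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two kinds of measure are usually reported together.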
The preprocessing steps include:
Logistic Regression is used as a baseline classification model. It predicts the probability of heart disease using clinical features.
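Advantages: simple and interpretable.
Limitation: limited ability to capture complex nonlinear relationships between clinical factors.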
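As a sketch of how a fitted Logistic Regression model turns clinical features into a probability, the example below applies the logistic (sigmoid) function to a weighted sum of features. The weights, bias, feature choice, and patient values are hypothetical and are not coefficients reported by the study.

```python
from math import exp


def sigmoid(z):
    """Map any real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + exp(-z))


def predict_proba(features, weights, bias):
    """Probability of heart disease under a logistic regression model."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


# Hypothetical coefficients for (ca, oldpeak, thalach); illustration only.
weights = [0.8, 0.6, -0.03]
bias = 3.0

patient = [2.0, 1.5, 120.0]
p = predict_proba(patient, weights, bias)
label = int(p >= 0.5)  # assumed 0.5 decision threshold
```

Because the score is a plain weighted sum, each coefficient can be read directly as the change in log-odds per unit of its feature. That is the source of the model's interpretability, and also the reason it cannot represent interactions unless they are added as explicit features.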
Random Forest is an ensemble model that combines multiple decision trees to improve prediction accuracy.
Advantages: improved accuracy and reduced overfitting.
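The ensemble idea can be illustrated with a deliberately simplified sketch: each "tree" here is a one-split decision stump trained on a bootstrap sample and a single randomly chosen feature, and the forest predicts by majority vote. This is not the full Random Forest algorithm and not the study's implementation; the function names, the toy data, and the tree count are assumptions made for the example.

```python
import random
from collections import Counter


def fit_stump(rows, labels, feature):
    """Pick the threshold and orientation with the fewest training errors."""
    best = None
    for threshold in sorted(set(r[feature] for r in rows)):
        for high_label in (0, 1):
            errors = sum(
                1 for r, y in zip(rows, labels)
                if (high_label if r[feature] > threshold else 1 - high_label) != y
            )
            if best is None or errors < best[0]:
                best = (errors, feature, threshold, high_label)
    return best[1:]  # (feature, threshold, high_label)


def predict_stump(stump, row):
    feature, threshold, high_label = stump
    return high_label if row[feature] > threshold else 1 - high_label


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train stumps on bootstrap samples, each using one random feature."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        feature = rng.randrange(n_features)         # random feature choice
        forest.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx], feature))
    return forest


def predict_forest(forest, row):
    """Majority vote over all stumps."""
    votes = Counter(predict_stump(s, row) for s in forest)
    return votes.most_common(1)[0][0]


# Toy data: (ca, oldpeak) per patient, label 1 = disease. Illustrative only.
rows = [[0, 0.0], [0, 0.5], [1, 0.2], [0, 1.0],
        [2, 2.0], [3, 3.1], [2, 1.8], [3, 2.6]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

forest = fit_forest(rows, labels)
low_risk = predict_forest(forest, [0, 0.0])
high_risk = predict_forest(forest, [3, 3.1])
```

Individual stumps disagree because each sees a different resample and feature, and the vote averages out their errors. A real Random Forest grows deeper trees and samples a feature subset at every split, but the bootstrap-and-vote structure is the same.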
This study presented an Explainable AI-driven framework for heart disease prediction using clinical parameters and ensemble learning techniques. The proposed approach utilized the UCI Heart Disease Dataset comprising 1,025 patient records and thirteen clinically relevant attributes to develop a predictive system capable of identifying the presence of heart disease. Two machine learning algorithms, Logistic Regression and Random Forest, were investigated and evaluated using standard classification metrics. Experimental results demonstrated that the Random Forest classifier outperformed Logistic Regression across all evaluation measures, including accuracy, precision, recall, and F1-score. The model achieved strong predictive performance while maintaining robust generalization capability. Furthermore, ROC analysis confirmed the effectiveness of the framework, achieving an Area Under the Curve (AUC) value of 0.857, indicating reliable discrimination between patients with and without heart disease.
A key contribution of this research is the integration of SHapley Additive Explanations (SHAP) to enhance model transparency and interpretability. Through both global and patient-level explanations, the framework identified clinically significant predictors such as the number of major vessels (ca), chest pain type (cp), thalassemia status (thal), ST depression induced by exercise (oldpeak), and maximum heart rate achieved (thalach). These explanations provide valuable insights into the factors influencing prediction outcomes and improve trust in machine learning-assisted clinical decision-making.
Unlike conventional black-box prediction systems, the proposed framework combines predictive accuracy with explainability, enabling healthcare professionals to understand the reasoning behind model decisions. This capability enhances transparency and supports the practical adoption of artificial intelligence in clinical environments. By providing interpretable predictions and clinically meaningful explanations, the framework serves as a reliable decision-support tool for heart disease risk assessment.
Future work may focus on evaluating the framework using larger and more diverse clinical datasets, incorporating additional machine learning and deep learning models, and implementing advanced validation strategies such as cross-validation and external dataset testing. Further improvements may also include real-time deployment within healthcare applications and the integration of additional explainability techniques to strengthen clinical usability and trust.
Copyright © 2026 Sourav Angre, Ritesh Patil, Prasad Yeole, Mrs. Varsha Dharmadhikari. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Paper Id : IJRASET83848
Publish Date : 2026-06-20
ISSN : 2321-9653
Publisher Name : IJRASET
